Undiscovered Public Knowledge: A Ten-Year Update

Authors

  • Don R. Swanson
  • Neil R. Smalheiser
Abstract

Two literatures or sets of articles are complementary if, considered together, they can reveal useful information of scientific interest not apparent in either of the two sets alone. Of particular interest are complementary literatures that are also mutually isolated and noninteractive (they do not cite each other and are not co-cited). In that case, the intriguing possibility arises that the information gained by combining them is novel. During the past decade, we have identified seven examples of complementary noninteractive structures in the biomedical literature. Each structure led to a novel, plausible, and testable hypothesis that, in several cases, was subsequently corroborated by medical researchers through clinical or laboratory investigation. We have also developed, tested, and described a systematic, computer-aided approach to finding and identifying complementary noninteractive literatures.

Specialization, Fragmentation, and a Connection Explosion

By some obscure spontaneous process, scientists have responded to the growth of science by organizing their work into specialties, thus permitting each individual to focus on a small part of the total literature. Specialties that grow too large tend to divide into subspecialties that have their own literatures which, by a process of repeated splitting, maintain more or less fixed and manageable size. As the total literature grows, the number of specialties, but not in general the size of each, increases (Kochen, 1963; Swanson, 1990c). But the unintended consequence of specialization is fragmentation. When the pie is divided up, the potential relationships among its pieces tend to be neglected.
Although scientific literature cannot, in the long run, grow disproportionately to the growth of the communities and resources that produce it, combinations of implicitly related segments of literature can grow much faster than the literature itself and can readily exceed the capacity of the community to identify and assimilate such relatedness (Swanson, 1993). The significance of the “information explosion” thus may lie not in an explosion of quantity per se, but in an incalculably greater combinatorial explosion of unnoticed and unintended logical connections.

The Significance of Complementary Noninteractive Literatures

If two literatures, each of substantial size, are linked by arguments that they respectively put forward (that is, are “logically” related, or complementary), one would expect to gain useful information by combining them. For example, suppose that one (biomedical) literature establishes that some environmental factor A influences certain internal physiological conditions and a second literature establishes that these same physiological changes influence the course of disease C. Presumably, then, anyone who reads both literatures could conclude that factor A might influence disease C. Under such conditions of complementarity one would also expect the two literatures to refer to each other. If, however, the two literatures were developed independently of one another, the logical linkage illustrated may be both unintended and unnoticed. To detect such mutual isolation, we examine the citation pattern. If two literatures are “noninteractive”, that is, if they have never (or seldom) been cited together, and if neither cites the other, then it is possible that scientists have not previously considered both literatures together, and so it is possible that no one is aware of the implicit A-C connection.
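The citation test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the article identifiers, the `cites` mapping, the function name `is_noninteractive`, and the `max_cocitations` threshold are all hypothetical.

```python
def is_noninteractive(lit_a, lit_c, cites, max_cocitations=0):
    """Return True if literatures A and C look mutually isolated.

    lit_a, lit_c: sets of article identifiers (hypothetical IDs).
    cites: dict mapping an article ID to the set of IDs it cites.
    max_cocitations: assumed tolerance for articles that cite both
        literatures (0 means no co-citation is allowed).
    """
    # Direct interaction: an A article cites a C article, or vice versa.
    a_cites_c = any(cites.get(p, set()) & lit_c for p in lit_a)
    c_cites_a = any(cites.get(p, set()) & lit_a for p in lit_c)
    if a_cites_c or c_cites_a:
        return False

    # Co-citation: some article cites at least one paper from each literature.
    cociting = sum(
        1 for refs in cites.values() if refs & lit_a and refs & lit_c
    )
    return cociting <= max_cocitations


# Invented toy data: two small literatures and a citation map.
lit_a = {"a1", "a2"}
lit_c = {"c1", "c2"}
isolated = {"a1": {"x"}, "c1": {"y"}, "r1": {"a1", "x"}}
cocited = dict(isolated, r2={"a2", "c1"})
direct = dict(isolated, a2={"c2"})
```

With this toy data, `isolated` passes the test, while `cocited` and `direct` fail it unless the co-citation tolerance is relaxed. A real check would have to draw the citation map from a citation index, which this sketch does not attempt.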
The two conditions, complementarity and noninteraction, describe a model structure that shows how useful information can remain undiscovered even though its components consist of public knowledge (Swanson, 1987, 1991).

From: KDD-96 Proceedings. Copyright © 1996, AAAI (www.aaai.org). All rights reserved.

Public Knowledge / Private Knowledge

There is, of course, no way to know in any particular case whether the possibility of an A-C relationship in the above model has or has not occurred to someone, or whether or not anyone has actually considered the two literatures on A and C together, a private matter that necessarily remains conjectural. However, our argument is based only on determining whether there is any printed evidence to the contrary. We are concerned with public rather than private knowledge, with the state of the record produced rather than the state of mind of the producers (Swanson, 1990d). The point of bringing together the AB and BC literatures, in any event, is not to "prove" an A-C linkage (by considering only transitive relationships) but rather to call attention to an apparently unnoticed association that may be worth investigating. In principle, any chain of scientific, including analogic, reasoning in which different links appear in noninteractive literatures may lead to the discovery of new interesting connections. "What people know" is a common understanding of what is meant by "knowledge". If taken in this subjective sense, the idea of "knowledge discovery" could mean merely that someone discovered something they hadn’t known before. Our focus in the present paper is on a second sense of the word "knowledge", a meaning associated with the products of human intellectual activity, as encoded in the public record, rather than with the contents of the human mind.
This abstract world of human-created "objective" knowledge is open to exploration and discovery, for it can contain territory that is subjectively unknown to anyone (Popper, 1972). Our work is directed toward the discovery of scientifically useful information implicit in the public record, but not previously made explicit. The problem we address concerns structures within the scientific literature, not within the mind.

The Process of Finding Complementary Noninteractive Literatures

During the past ten years, we have pursued three goals: i) to show in principle how new knowledge might be gained by synthesizing logically related noninteractive literatures; ii) to demonstrate that such structures do exist, at least within the biomedical literature; and iii) to develop a systematic process for finding them. In pursuit of goal iii, we have created interactive software and database search strategies that can facilitate the discovery of complementary structures in the published literature of science. The universe or search space under consideration is limited only by the coverage of the major scientific databases, though we have focused primarily on the biomedical field and the MEDLINE database (8 million records). In 1991, a systematic approach to finding complementary structures was outlined and became a point of departure for software development (Swanson, 1991). The system that has now taken shape is based on a three-way interaction between computer software, bibliographic databases, and a human operator. The interaction generates information structures that are used heuristically to guide the search for promising complementary literatures. The user of the system begins by choosing a question or problem area of scientific interest that can be associated with a literature, C.
Elsewhere we describe and evaluate experimental computer software, which we call ARROWSMITH (Swanson & Smalheiser, 1997), that performs two separate functions that can be used independently. The first function produces a list of candidates for a second literature, A, complementary to C, from which the user can select one candidate (at a time) as input, along with C, to the second function. This first function can be considered as a computer-assisted process of problem-discovery, an issue identified in the AI literature (Langley et al., 1987, pp. 304-307). Alternatively, the user may wish to identify a second literature, A, as a conjecture or hypothesis generated independently of the computer-produced list of candidates. Our approach has been based on the use of article titles as a guide to identifying complementary literatures. As indicated above, our point of departure for the second function is a tentative scientific hypothesis associated with two literatures, A and C. A title-word search of MEDLINE is used to create two local computer title-files associated with A and C, respectively. These files are used as input to the ARROWSMITH software, which then produces a list of all words common to the two sets of titles, except for words excluded by an extensive stoplist (presently about 5000 words). The resulting list of words provides the basis for identifying title-word pathways that might provide clues to the presence of complementary arguments within the literatures corresponding to A and C. The output of this procedure is a structured title display (plus journal citation) that serves as a heuristic aid to identifying word-linked titles and serves also as an organized guide to the literature.
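The second function described above (words common to two title files, minus a stoplist, with the linked titles grouped under each word) can be illustrated with a minimal Python sketch. It is not the ARROWSMITH implementation: the function names, the tiny stoplist, the tokenization rule, and the example titles are all invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical miniature stoplist; the paper's stoplist has about 5000 words.
STOPLIST = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "with",
            "patients", "study"}


def title_words(title):
    """Lowercase a title and return its set of non-stoplisted words."""
    words = re.findall(r"[a-z]+", title.lower())
    return {w for w in words if w not in STOPLIST}


def linking_words(titles_a, titles_c):
    """Map each word shared by the two title files to the titles using it.

    Returns a dict, ordered alphabetically by word, whose values are
    (titles from A containing the word, titles from C containing the word).
    """
    index_a, index_c = defaultdict(list), defaultdict(list)
    for title in titles_a:
        for word in title_words(title):
            index_a[word].append(title)
    for title in titles_c:
        for word in title_words(title):
            index_c[word].append(title)
    shared = sorted(index_a.keys() & index_c.keys())
    return {word: (index_a[word], index_c[word]) for word in shared}


# Invented titles standing in for the A and C title files.
titles_a = [
    "Dietary fish oil reduces blood viscosity",
    "Fish oil and platelet aggregation in healthy volunteers",
]
titles_c = [
    "Blood viscosity in Raynaud disease",
    "Platelet aggregation abnormalities in Raynaud patients",
]

links = linking_words(titles_a, titles_c)
for word, (from_a, from_c) in links.items():
    print(word, "| A:", from_a, "| C:", from_c)
```

Grouping titles under each shared word gives a rough analogue of the structured title display: a reader scans the word list, then compares the A and C titles that each word links. Stemming, phrase handling, and ranking are left out of this sketch.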

Similar resources

Undiscovered Public Knowledge: The Potential of Research Synthesis Approaches in Tourism Research

Thirty years ago, Donald Swanson coined the term ‘Undiscovered Public Knowledge’ to refer to the product of the synthesis of previous research. In the current climate, in a range of disciplines relevant to tourism research (e.g. policy, management, economics and psychology) there has been an increasing interest in the maximisation and re-use of previous research through a range of research synt...

INFO 662/Final Assignment: Annotated Bibliography of Automated Metadata Extraction in Literature-based Knowledge Discovery Overview of LBKD

Don R. Swanson (1924-2012) pioneered the field of literature-based knowledge discovery (LBKD), which uses existing research to create new knowledge [Swanson 1986]. He believed that unearthing unseen links between two distinct areas of study could yield new discoveries—what he called “undiscovered public knowledge.” He wanted to demonstrate the presence of these “undiscovered connections” between...

A bird's-eye view to Urmia Medical Journal, 2016-2019: an update

Dear editor Periodical assessment and monitoring of journal statistics by editor in chief and other related editorial board bring important insight to determine the quality of scientific production process and provide detail if a journal is paving the way to join the mainstream internationally recognized indexing databases such as ISI [www.webofknowledge.com], Medline [www.Pubmed.org] and Sc...

A Semantic Approach for Mining Hidden Links from Complementary and Non-interactive Biomedical Literature

Two complementary and non-interactive literature sets of articles, when they are considered together, can reveal useful information of scientific interest not apparent in either of the two sets alone. Swanson called the existence of such hidden links as undiscovered public knowledge (UPK). The novel connection between Raynaud disease and fish oils was uncovered from complementary and non-intera...

Association of Behçet’s Disease with Osteogenesis Imperfecta in A Ten-Year-Old Girl

Osteogenesis Imperfecta (OI) is a genetic disorder characterized by bones that break easily, often from little or no apparent cause. In this article, we present a patient suffering from OI, who had concomitant active Behçet’s Disease (BD) with repeated oro-genital ulcers, skin pustular eruptions and severe recurrent bilateral uveitis. This patient is, to our knowledge, the first reported case in ...

A Ten-year Report of Drug and Poison Information Center in Mashhad, Iran 2007-2017

The Mashhad drug and poison information center (MDPIC) was officially established in 2000 to provide up-to-date information on medications. The objective of this study is to provide an epidemiologic profile of drug inquiry and poisoning-related phone calls to MDPIC from 2007 to 2017. This article is a descriptive retrospective study in which all inquiries about drugs and poisoning cases receive...

Publication date: 1996